Behavior control apparatus and method

ABSTRACT

The invention relates to a behavior control apparatus and method for autonomously controlling a mobile unit based on visual information in practical applications, without requiring a great deal of preparation or computational cost and without limiting the type of target object. According to one aspect of the invention, a method for controlling behavior of a mobile unit using behavior commands is provided. First, sensory inputs are captured and the motion of the mobile unit is estimated. The portion which includes a target object to be a target for behavior of the mobile unit is segregated from the sensory inputs. The target object is extracted from the segregated portion and the location of the target object is acquired. Finally, the mobile unit is controlled based on the location of the target object.

TECHNICAL FIELD

The present invention relates to a behavior control apparatus and method for a mobile unit, and in particular to a behavior control apparatus and method for recognizing a target object in acquired images and controlling behavior of the mobile unit with high accuracy based on the recognized target object.

BACKGROUND ART

To control a mobile unit with high accuracy based on input images, it is necessary for a control system to recognize an object in the image as a target for behavior of the mobile unit. One approach is that the control system learns training data pre-selected by an operator prior to recognition. Specifically, the control system searches the input images to extract shapes or colors designated as features of the target. Then, the control system outputs commands to make the mobile unit move toward the extracted target.

However, it is necessary for the operator to teach features such as the shape or color of the target in detail to the control system, and the preparation for this is a burden in terms of time and labor. In addition, since control would be interrupted when the target moves out of the input image, it is difficult to apply this approach to practical use.

An alternative approach is that a template for the target is prepared, and while the mobile unit is being controlled the template is repeatedly applied to input images to search for and extract the shape and location of the target in detail. In this case, however, the computational cost becomes huge because a computer has to keep calculating the shape and location of the target. Furthermore, the search for the target may fall into a local solution.

Therefore, to control the behavior of the mobile unit efficiently and flexibly, it is preferable to make the mobile unit move autonomously rather than to use a supervised learning method in which a target is specified beforehand. To achieve this, a method for recognizing the target autonomously and learning the location of the target is needed. Japanese Patent Application Unexamined Publication (Kokai) No. H8-126981 discloses an image position recognition method for a robot system. According to the method, the target object is searched out autonomously even when the target object is missing from the input image due to an error. However, the method requires that the work plane used for image recognition be painted with various colors prior to operation, which is a substantially time-consuming task.

Japanese Patent Application Unexamined Publication (Kokai) No. H7-13461 discloses a method for guiding autonomous mobile robots that manage indoor air-conditioning units. According to the method, a target object for guidance is detected through image processing and the robot is led toward the target. However, the method requires the blowing outlets of air-conditioning units to be the target objects, which lacks generality.

Therefore, it is an objective of the present invention to provide a behavior control apparatus and method for autonomously controlling a mobile unit based on visual information in practical applications, without requiring a great deal of preparation or computational cost and without limiting the type of target object.

DISCLOSURE OF INVENTION

According to one aspect of the invention, a behavior control apparatus for controlling behavior of a mobile unit is provided. The apparatus comprises sensory input capturing means for capturing sensory inputs and motion estimating means for estimating motion of the mobile unit. The apparatus further comprises target segregation means for segregating, from the sensory inputs, the portion which includes a target object to be a target for behavior of the mobile unit, and target object matching means for extracting the target object from the segregated portion. The apparatus still further comprises target location acquiring means for acquiring the location of the target object and behavior decision means for deciding a behavior command for controlling the mobile unit based on the location of the target object.

The behavior control apparatus roughly segregates the portion that includes a target object of behavior from sensory inputs, such as images, based on the estimation of motion. The apparatus then specifies a target object from the portion, acquires the location of the target object and outputs a behavior command which moves the mobile unit toward the location. Thus, detailed features of the target object need not be predetermined. In addition, because features irrelevant to the present behavior are eliminated, the computational load is reduced. Therefore, highly efficient and accurate control of the mobile unit may be implemented.

As used herein, “mobile unit” refers to a unit which has a driving mechanism and moves in accordance with behavior commands.

The sensory inputs may be images of the external environment of the mobile unit.

The motion estimating means comprises behavior command output means for outputting the behavior command and behavior evaluation means for evaluating the result of the behavior of the mobile unit. The motion estimating means further comprises learning means for learning the motion of the mobile unit using the relationship between the sensory inputs and the behavior result, and storage means for storing the learning result.

The behavior control apparatus pre-learns the relationship between sensory inputs and behavior commands. The apparatus then updates the learning result when a new feature is acquired in the behavior control stage. The learning result is represented as a probabilistic density distribution. Thus, the motion of the mobile unit in the behavior control stage may be estimated with high accuracy.

The motion of the mobile unit may be captured using a gyroscope instead of estimating it.

The target segregation means segregates the portion by comparing the sensory inputs and the estimated motion using, for example, optical flow. Thus the behavior control apparatus may roughly segregate the portion that includes a target object.

The target location acquiring means defines the center of the target object as the location of the target object, and the behavior decision means outputs the behavior command to move the mobile unit toward the location of the target object. Thus the mobile unit may be controlled stably.

The behavior decision means calculates the distance between the mobile unit and the location of the target object, and decides the behavior command so as to decrease the calculated distance. This calculation is very simple and helps to reduce the amount of computation.

If the calculated distance is greater than a predetermined value, the target segregation means repeats the segregation of the portion which includes the target object.

The target object matching means extracts the target object by pattern matching between the sensory inputs and predetermined templates. Thus the target object may be extracted more accurately.

Other embodiments and features will be apparent by reference to the following description in connection with the accompanying drawings.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 shows an overall view of a radio-controlled (RC) helicopter according to one embodiment of the invention;

FIG. 2 is a functional block diagram illustrating one exemplary configuration of a behavior control apparatus according to the invention;

FIG. 3 is a graph illustrating the relationship between a generative model and minimum variance;

FIG. 4 shows a conceptual illustration of a target object recognized by means of target segregation;

FIG. 5 is a chart illustrating that the range of the target object is narrowed by learning;

FIG. 6 is a flowchart illustrating the control routine of an RC helicopter;

FIG. 7 is a chart illustrating the distance between a target location and the center of motion;

FIG. 8 is a graph illustrating the unstable control status of the mobile unit in the initial stage of behavior control;

FIG. 9 is a graph illustrating that the vibration of the motion of the mobile unit is getting smaller; and

FIG. 10 is a graph illustrating the stable control status of the mobile unit in the last stage of behavior control.

BEST MODE FOR CARRYING OUT THE INVENTION

Preferred embodiments of the present invention will be described as follows with reference to the drawings.

A behavior control apparatus according to the invention recognizes a target object, which serves as a reference for controlling a mobile unit, from input images and then controls behavior of the mobile unit based on the recognized target object. The apparatus is used as installed on the mobile unit, which has a driving mechanism and is movable by itself.

Configuration

FIG. 1 shows a radio-controlled (RC) helicopter 100 according to one embodiment of the invention. The RC helicopter 100 consists of a body 101, a main rotor 102 and a tail rotor 103. On the body 101 are installed a CCD camera 104, a behavior control apparatus 105 and a servomotor 106. At the base of the tail rotor 103 there is a link mechanism 107, which is coupled with the servomotor 106 through a rod 108. The RC helicopter 100 can float in the air by rotating the main rotor 102 and the tail rotor 103.

The CCD camera 104 takes images of the frontal vision of the RC helicopter. The area captured by the camera is shown in FIG. 1 as a visual space 109. The behavior control apparatus 105 autonomously recognizes a location of a target object 110 (hereinafter simply referred to as “target location 110”), which is to be a target for behavior control, and also recognizes a self-referential point in the visual space 109 based on the image taken by the CCD camera 104. The target location 110 is represented as a probabilistic density distribution, as described later, and is conceptually illustrated as an ellipse in FIG. 1.

The RC helicopter 100 is tuned so that only control of the yaw orientation (indicated by an arrow in FIG. 1, around the vertical axis) is enabled. Therefore, the term “stable” as used herein means that the vibration of the orientation in which the RC helicopter is directed is small.

The behavior control apparatus 105 outputs behavior commands to move the self-referential point (for example, the center; hereinafter referred to as COM 111, an acronym for “center of motion”) of the image captured by the CCD camera 104 (the visual space 109) toward the target location 110 in order to control the RC helicopter 100 stably. The behavior commands are sent to the servomotor 106. In response to the behavior commands, the servomotor 106 drives the rod 108, activating the link mechanism to alter the angle of the tail rotor 103 so as to rotate the RC helicopter 100 in the yaw orientation.

In the embodiment described above, the controllable orientation is limited to one-dimensional operation, in which the COM moves from side to side, for the purpose of simple explanation. However, the present invention may also be applied to position control in two or three dimensions.

Although the RC helicopter 100 is described as an example of the mobile unit having the behavior control apparatus of the present invention, the apparatus may be installed on any mobile unit having a driving mechanism and able to move by itself. In addition, the mobile unit is not limited to flying objects like a helicopter, but includes, for example, vehicles traveling on the ground. The mobile unit further includes a unit of which only a part can move. For example, the behavior control apparatus of the present invention may be installed on an industrial robot whose base is fixed to the floor, to recognize an operation target of the robot.

FIG. 2 is a functional block diagram of the behavior control apparatus 105. The behavior control apparatus 105 comprises an image capturing block 202, a behavior command output block 204, a behavior evaluation block 206, a learning block 208, a storage block 210, a target segregation block 212, a matching block 214, a target location acquiring block 216 and a behavior decision block 218. The behavior control apparatus 105 may be implemented by running a program according to the present invention on a general-purpose computer, and it can also be implemented by means of hardware having the functionality of the invention.

The behavior control apparatus 105 first learns the relationship between features of inputs (e.g., images taken by the CCD camera 104) and the behavior of the mobile unit. These operations are inclusively referred to as the “learning stage”. Upon completing the learning stage, the apparatus may estimate the motion of the mobile unit based on the captured images using the learned knowledge. The apparatus further searches for and extracts the target location in the image autonomously using the estimated motion. Finally, the apparatus controls the motion of the mobile unit with reference to the target location. These operations are inclusively referred to as the “behavior control stage”.

It should be noted that the behavior control apparatus 105 shown in FIG. 2 is configured for use on the RC helicopter 100, and the apparatus may be configured in various manners depending on the characteristics of the mobile unit on which it is installed. For example, the apparatus may further include a gyroscope sensor. In this case, the apparatus uses the signals generated from the gyroscope sensor to estimate the motion of the mobile unit, and uses the sensory input captured by the image capturing block 202 only for recognizing the target location.

Learning

In the learning stage, while moving the mobile unit, the behavior control apparatus 105 learns the relationship between features of input images taken by an image pickup device and the behavior result in response to behavior commands from the behavior command output block 204. The apparatus then stores the learning result in the storage block 210. This learning enables the apparatus to estimate the motion of the mobile unit accurately based on input images in the behavior control stage described later.

The image capturing block 202 receives images at every predetermined interval from an image pickup device such as the CCD camera 104 installed in front of the RC helicopter 100. The block 202 then extracts features as sensory inputs I_(i)(t) (i=1,2, . . . ) from the images. This feature extraction may be implemented by any of the prior-art approaches such as optical flow. The extracted features are sent to the behavior evaluation block 206.
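As an illustration of such a feature extractor, the following sketch computes dense optical flow between two consecutive grayscale frames with OpenCV's Farneback method and averages it over a coarse grid to obtain a small feature vector. The grid size, the helper name extract_flow_features and the Farneback parameters are assumptions made for illustration; the patent only requires that some prior-art feature extraction such as optical flow be used.

```python
import numpy as np
import cv2

def extract_flow_features(prev_gray, curr_gray, grid=(8, 8)):
    """Illustrative sketch: dense optical flow between two grayscale frames,
    averaged over a coarse grid to form a compact feature vector I_i(t)."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = prev_gray.shape
    gh, gw = grid
    features = []
    for r in range(gh):
        for c in range(gw):
            cell = flow[r * h // gh:(r + 1) * h // gh,
                        c * w // gw:(c + 1) * w // gw]
            features.extend(cell.reshape(-1, 2).mean(axis=0))  # mean (dx, dy)
    return np.asarray(features)
```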

The behavior command output block 204 outputs behavior commands Q_(i)(t), which direct the behavior of the mobile unit. While the learning is immature in the initial stage, behavior commands are read from a command sequence which is selected randomly beforehand. While the mobile unit moves randomly, the behavior control apparatus 105 may learn the knowledge necessary for estimating the motion of the mobile unit. As for the RC helicopter 100 shown in FIG. 1, the behavior commands correspond to the driving current of the servomotor 106, which drives the link mechanism 107 to change the yaw orientation. The behavior command is sent to a driving mechanism such as the servomotor 106 and to the behavior evaluation block 206. The relationship between the sensory inputs I_(i)(t) and the behavior commands Q_(i)(t) is represented by the following mapping ƒ:

ƒ: I_(i)(t) → Q_(i)(t)  (1)

where subscript i (i=1,2, . . . ) means the i-th data. For example, the mapping ƒ may be given as a non-linear approximation using the well-known Fourier series or the like.

In an alternative embodiment, the behavior command output block 204 receives a signal from an external device and outputs behavior commands in accordance with the signal.

The behavior evaluation block 206 generates a reward depending on both the sensory inputs I_(i)(t) from the image capturing block 202 and the behavior result in response to the behavior command Q_(i)(t), based on a predetermined evaluation function under a reinforcement learning scheme. An example of the evaluation function is a function that yields reward “1” when the mobile unit controlled by the behavior command is stable and otherwise yields reward “2”. After the rewards are yielded, the behavior evaluation block 206 generates a plurality of columns 1,2,3, . . . , m, as many as the number of types of rewards, and distributes behavior commands into each column according to the type of their rewards. Hereinafter the behavior commands Q_(i)(t) distributed in column l are denoted as “Q_(i)^(l)(t)”. The sensory inputs I_(i)(t) and behavior commands Q_(i)(t) are supplied to the learning block 208 and used for learning the relationship between them.

The purpose of the evaluation function is to minimize the variance of the behavior commands. In other words, reinforcement learning satisfying σ(Q¹)<σ(Q²) is executed with the evaluation function. The minimum variance of the behavior commands needs to be reduced for smooth control. Learning with the evaluation function allows the behavior control apparatus 105 to eliminate unnecessary sensory inputs and to learn important sensory inputs selectively.

In each column, both sensory inputs and the behavior commands are stored according to the type of rewards given to the behavior commands.
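A minimal sketch of this reward-and-column bookkeeping is shown below, assuming the two-valued example evaluation function described above; the stability test passed in as is_stable and the in-memory data layout are hypothetical stand-ins for whatever evaluation function and storage the apparatus actually uses.

```python
from collections import defaultdict
import numpy as np

# Columns: one list of (sensory input, behavior command) pairs per reward type.
columns = defaultdict(list)

def evaluate_and_store(I_t, Q_t, is_stable):
    """Assign the example reward ("1" if the commanded behavior kept the unit
    stable, otherwise "2") and file the (input, command) pair in its column."""
    reward = 1 if is_stable else 2
    columns[reward].append((I_t, Q_t))
    return reward

def column_command_variance(reward):
    """Variance of behavior commands stored in one column; learning aims to
    make this variance small so that control becomes smooth."""
    cmds = np.array([q for _, q in columns[reward]])
    return cmds.var(axis=0) if len(cmds) else None
```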

Each column 1,2,3, . . . ,m corresponds to a cluster model of the behavior commands. Each column is used to calculate a generative model g(Ω_(l)), where l denotes the number of the attention class applied. A generative model is a storage model generated through learning, and may be represented by a probabilistic density function in statistical learning. Non-linear estimation such as a neural network may be used to model g(Ω_(l)), which gives the estimation of the probabilistic density distribution P(Q|Ω_(l)). In the present embodiment, it is assumed that P(Q|Ω_(l)) takes the form of a Gaussian mixture model, which may approximate any probabilistic density function. FIG. 3 shows the relationship between the number of generative models (horizontal axis) and the minimum variance (vertical axis).

Only one column would not accelerate the convergence of the learning because, if so, it would take much time until the normal distribution curve of the behavior commands stored in the column is sharpened and the variance gets small. In order to control the mobile unit stably, it is necessary to learn in such a way that the variance of the normal distribution of the motor output becomes smaller. One feature of the invention is that the normal distribution curve is sharpened rapidly since a plurality of columns are generated. A method utilizing such a minimum variance theory is described in Japanese Patent Application Unexamined Publication (Kokai) No. 2001-028758.

Then the learning process described later is executed in the learning block 208. After the learning process is completed, a behavior command that minimizes the variance of the normal distribution curve of behavior commands for a new sensory input may be selected from the column by means of a statistical learning scheme, and rapid stability of the mobile unit may be attained.

Now the learning process at the learning block 208 will be described in detail.

The learning block 208 calculates the class of attention Ω_(l) corresponding one by one to each column l which contains the behavior commands, using an identity mapping translation. This translation is represented by the following mapping h:

h: Q_(i)(t) → Ω_(l)(t)  (2)

The purpose of the class of attention Ω_(l) is efficient learning by focusing on particular sensory inputs out of the massive sensory inputs when new sensory inputs are given. Generally, the amount of sensory inputs far exceeds the processing capacity of the computer. Thus, appropriate filtering of the sensory inputs with the classes of attention Ω_(l) improves the efficiency of the learning. Therefore, the learning block 208 may eliminate the sensory inputs except the selected small subset of them.

As the learning progresses, the learning block 208 may determine directly the class of attention corresponding to a sensory input using the statistical probability, without calculating the mapping ƒ and/or h one by one. More specifically, each of the classes of attention Ω_(l) is a parameter for modeling the behavior commands Q_(i)^(l)(t) stored in each column using the probabilistic density function of the normal distribution. To obtain the probabilistic density function, a mean μ and covariance Σ need to be calculated for the behavior commands Q_(i)^(l)(t) stored in each column. This calculation is performed by an unsupervised Expectation Maximization (EM) algorithm using the clustered component algorithm (CCA), which will be described later. It should be noted that the classes of attention Ω_(l) are modeled on the assumption that a true probabilistic distribution p(I_(i)(t)|Ω_(l)) exists for each class of attention Ω_(l).

Using the obtained parameters, the probabilistic density function of each class of attention Ω_(l) may be obtained. The obtained density functions are used as the prior probability p̄(Ω_(l)(t)) (= p̄(Q_(l)(t)|Ω_(l)(t))) of each class of attention before sensory inputs are given. In other words, each class of attention Ω_(l) is assigned as an element of the probabilistic density function p(Q_(i)^(l)(t)|Ω_(l),θ).

After the classes of attention Ω_(l) are calculated, the learning block 208 learns the relation between the sensory inputs and the classes of attention by means of a supervised learning scheme using a neural network. More specifically, this learning is executed by obtaining the conditional probabilistic density function p_(λ)(I_(i)(t)|Ω_(l)) of the class of attention Ω_(l) and the sensory input I_(i)(t) using a hierarchical neural network with the class of attention Ω_(l) as the supervising signal. It should be noted that the class of attention may be calculated by the synthetic function ƒ·h. The obtained conditional probabilistic density function p(I_(i)(t)|Ω_(l)) corresponds to the probabilistic relation between the sensory input and the class of attention.

New sensory inputs obtained by the CCD camera 104 are provided to the behavior control apparatus 105 after the learning is over. The learning block 208 selects the class of attention corresponding to the provided sensory input using a statistical learning scheme such as Bayes' learning. This operation corresponds to calculating the conditional probabilistic density function p(Ω_(l)|I_(i)(t)) of the class of attention Ω_(l) relative to the sensory inputs I_(i)(t). As noted above, since the probabilistic density function of the sensory inputs and the class of attention has already been estimated by the hierarchical neural network, newly given sensory inputs may be directly assigned to a particular class of attention. In other words, after the supervised learning with the neural network is over, calculation of the mapping ƒ and/or h becomes unnecessary for selecting the class of attention Ω_(l) relative to the sensory input I_(i)(t).

In this embodiment, Bayes' learning scheme is used as the statistical learning scheme. Assume that sensory inputs I_(i)(t) are given and both the prior probability p̄(Ω_(l)(t)) and the probabilistic density function p(I_(i)(t)|Ω_(l)) have been calculated beforehand. The maximum posterior probability for each class of attention is calculated by the following Bayes' rule.

$p(\Omega_{l}(t)) = \dfrac{\bar{p}(\Omega_{l}(t))\; p(I_{i}(t) \mid \Omega_{l}(t))}{\sum_{k} \bar{p}(\Omega_{k}(t))\; p(I_{i}(t) \mid \Omega_{k}(t))} \qquad (3)$

The p(Ω_(l)(t)) may be called the “belief” of Ω_(l) and is the probability that a sensory input I_(i)(t) belongs to a class of attention Ω_(l)(t). Calculating the probability that a sensory input I_(i)(t) belongs to a class of attention Ω_(l) using Bayes' rule implies that one class of attention Ω_(l) can be identified selectively by increasing the belief (weight) through learning by Bayes' rule.

The class with the highest probability (belief) is selected as the class of attention Ω_(l) corresponding to the provided sensory input I_(i)(t). Thus, the behavior control apparatus 105 may obtain the class of attention Ω_(l), which is a hidden parameter, from the directly observable sensory input I_(i)(t) using Bayes' rule and assign the sensory input I_(i)(t) to the corresponding class of attention Ω_(l).
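The selection can be written directly from Eq. (3): given the prior beliefs p̄(Ω_l) and the learned class-conditional densities p(I|Ω_l), the posterior belief of each attention class is computed and the class with the highest belief is returned. In the sketch below the class-conditional densities are stood in for by Gaussian densities, which is an illustrative assumption; the patent obtains them from the hierarchical neural network.

```python
import numpy as np
from scipy.stats import multivariate_normal

def select_attention_class(I_t, priors, means, covs):
    """Bayes' rule of Eq. (3): posterior belief of each attention class
    given a sensory input, then pick the class with the highest belief.
    Modeling p(I|Omega_l) as a Gaussian here is an illustrative assumption."""
    likelihoods = np.array([multivariate_normal.pdf(I_t, mean=m, cov=c)
                            for m, c in zip(means, covs)])
    unnormalized = np.asarray(priors) * likelihoods
    beliefs = unnormalized / unnormalized.sum()   # p(Omega_l | I_i(t))
    return int(np.argmax(beliefs)), beliefs
```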

The learning block 208 further searches for the behavior command according to the sensory input stored in the column corresponding to the selected class of attention, and then sends the retrieved behavior command to the target segregation block 212.

As noted above, using the blocks 204–210, the behavior control apparatus may estimate the motion of the mobile unit accurately based on input images. Therefore, these blocks are inclusively referred to as the “motion estimating means” in the appended claims.

Behavior Control

In the behavior control stage, the behavior control apparatus 105 estimates the motion based on the input image and roughly segregates the location of the target object (target location). Then the apparatus performs pattern matching with templates which are stored in the memory as target objects and calculates the target location more accurately. The apparatus then outputs the behavior command based on the distance between the target location and the center of motion (COM). By repeating this process, the target location is refined and the mobile unit reaches a stably controlled status. In other words, the apparatus segregates the target based on motion estimation and understands what is to be the target object.

Now the functionality of each block is described.

The target segregation block 212 roughly segregates and extracts a portion including the target object, which is to be the behavior reference of the mobile unit, from the visual space. For example, the segregation is done by comparing the optical flow of the image and the estimated motion.

The target object matching block 214 uses templates to extract the target object more accurately. The target object matching block 214 compares the templates and the segregated portion and determines whether the portion is the object to be targeted or not. The templates are prepared beforehand. If there are a plurality of target objects, or if there are a plurality of objects which match the templates, the object having the largest matching index is selected.
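One possible realization of this matching is sketched below using normalized cross-correlation as the matching index; the patent does not prescribe a particular index, so this choice and the OpenCV calls are illustrative assumptions.

```python
import cv2

def match_target(segregated_gray, templates):
    """Compare each prepared template against the segregated portion and
    return the best match (largest matching index) and its location.
    Normalized cross-correlation is one possible matching index."""
    best = (None, -1.0, None)  # (template index, score, top-left corner)
    for idx, tmpl in enumerate(templates):
        result = cv2.matchTemplate(segregated_gray, tmpl, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(result)
        if max_val > best[1]:
            best = (idx, max_val, max_loc)
    return best
```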

A target location acquiring block 216 defines the center point of the target object as the target location.

When the target location is defined, the behavior decision block 218 supplies a request signal to the behavior command output block 204. When the request signal is received, the behavior command output block 204 outputs the behavior command to move the mobile unit such that its center of motion (COM) overlaps the location of the target object.

Segregating the target from the non-target is indispensable for determining the behavior command autonomously. The reason is that a target object segregated by target segregation may be used to select the optimal behavior for controlling the mobile unit toward the target location. In other words, the actually most suitable behavior is selected by predicting the center of motion (COM) based on selective attention. This allows the behavior control apparatus to search for the location of the target object accurately in the captured image. FIG. 4 is a diagram illustrating target object segregation recognized by the target segregation block 212. Ellipses 401, 402, 403 are the clusters to be the location of the target object, calculated based on the estimated motion and represented as normal distributions Ω₁, Ω₂, Ω₃, respectively. These are attention classes extracted from feature information of the image. The mixture distribution of the three normal distribution models Ω₁, Ω₂, Ω₃ is shown as a dotted-line ellipse in FIG. 4. The center of motion is acquired as the center of the mixture distribution in the visual space. Each Gaussian distribution in the visual space is produced by projecting the clustered behavior space, based on the center of motion, onto the visual space with a non-linear mapping such as a neural network.

Assuming that Ω_(TL) represents the target location and σ represents the area where segregation may be executed in the captured image, the location of the target object is modeled by the probability density function P(Ω_(TL)|σ). Since the location Ω_(TL) is basically an uncertain value, it is assumed that the location has behavior control noise (that is, variance of the probabilistic density distribution). By repeating the feedback process, the noise (variance) of the target location is reduced and the location is refined. In the present invention, the reduction of noise (variance) depends on the accuracy of the motion estimation of the mobile unit.

FIG. 5 is a chart illustrating that the range of the target location is refined (reduced) by the learning. The learning block 208 narrows down the uncertain probability range (in other words, the variance of the probabilistic density distribution) σ of the target location by, for example, Bayes' learning.
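One way to picture this narrowing is the standard Gaussian (conjugate) Bayes update sketched below, in which every new observation of the target location reduces the posterior variance. The conjugate-Gaussian form is only an assumption used to illustrate the shrinking probability range; the patent states merely that the range σ is narrowed by, for example, Bayes' learning.

```python
def update_target_location(mu_prior, var_prior, observation, var_obs):
    """One Gaussian (conjugate) Bayes update of the target-location belief.
    Repeated updates shrink the posterior variance, i.e. the uncertain
    probability range sigma of the target location is narrowed."""
    k = var_prior / (var_prior + var_obs)        # Kalman-style gain
    mu_post = mu_prior + k * (observation - mu_prior)
    var_post = (1.0 - k) * var_prior             # always <= var_prior
    return mu_post, var_post

# Example: the variance decreases with every feedback cycle.
mu, var = 0.0, 1.0
for z in (0.4, 0.35, 0.42, 0.38):
    mu, var = update_target_location(mu, var, z, var_obs=0.2)
```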

CCA Reinforced EM Algorithm

Now the CCA reinforced EM algorithm is described in detail.

The EM algorithm is an iterative algorithm for estimating the maximum likelihood parameters when the observed data can be viewed as incomplete data. When the observed data follow the normal distribution, the parameter θ is represented by θ(μ, Σ).

In one embodiment of the invention, the model of the feature vector is built by means of Bayes' parameter estimation. This is employed to estimate the number of clusters which best represents the data structure.

An algorithm to estimate the parameters of a Gaussian mixture model will be described. This algorithm is essentially similar to conventional clustering, but is different in that it can estimate parameters closely even when clusters overlap. Therefore, a sample of training data is used to determine the number of subclasses and the parameters of each subclass.

Let Y be an M dimensional random vector to be modeled using a Gaussian mixture distribution. Assume that this model has K subclasses. The following parameters are required to completely specify the k-th subclass.

-   π_(k): the probability that a pixel has subclass k
-   μ_(k): the M dimensional spectral mean vector for subclass k
-   R_(k): the M × M spectral covariance matrix for subclass k

π, μ, R denote the following parameter sets, respectively.

$\{\pi_{k}\}_{k=1}^{K},\ \{\mu_{k}\}_{k=1}^{K},\ \{R_{k}\}_{k=1}^{K} \qquad (4)$

The complete set of parameters for the class is then given by K and θ=(π, μ, R). Note that the parameters are constrained in a variety of ways. In particular, K must be an integer greater than 0, π_(k)≧0 with Σπ_(k)=1, and det(R_(k))≧ε, where ε might be chosen depending on the application. The set of admissible θ for a K-th order model is denoted by ρ^((K)).

Let Y₁, Y₂, . . . , Y_(N) be N multispectral pixels sampled from the class of interest. Moreover, assume that the subclass of each pixel Y_(n) is given by the random variable X_(n). Certainly, X_(n) is normally not known, but it can be useful for analyzing the problem.

Letting each subclass be a multivariate Gaussian distribution, the probability density function for the pixel Y_(n) given X_(n)=k is given by

$p_{y_{n}|x_{n}}(y_{n} \mid k, \theta) = \dfrac{1}{(2\pi)^{M/2}} \left| R_{k} \right|^{-1/2} \exp\left\{ -\tfrac{1}{2}\,(y_{n} - \mu_{k})^{t} R_{k}^{-1} (y_{n} - \mu_{k}) \right\} \qquad (5)$

Since the subclass X_(n) of each sample is not known, to compute the density function of Y_(n) for a given parameter θ, the following definition of conditional probability is applied.

$p_{y_{n}}(y_{n} \mid \theta) = \sum_{k=1}^{K} p_{y_{n}|x_{n}}(y_{n} \mid k, \theta)\, \pi_{k} \qquad (6)$

The logarithm of the probability of the entire sequence

$Y = \{Y_{n}\}_{n=1}^{N} \qquad (7)$

is as follows.

$\log p_{y}(y \mid K, \theta) = \sum_{n=1}^{N} \log\left( \sum_{k=1}^{K} p_{y_{n}|x_{n}}(y_{n} \mid k, \theta)\, \pi_{k} \right) \qquad (8)$

The objective is then to estimate the parameters K and θ∈ρ^((K)).
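For a candidate parameter set θ = (π, μ, R), the quantity in Eq. (8) can be evaluated numerically as sketched below; the code is a direct transcription of Eqs. (5), (6) and (8), with the Gaussian density supplied by scipy.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mixture_log_likelihood(Y, pi, mu, R):
    """log p_y(y | K, theta) of Eq. (8): sum over samples of the log of the
    Gaussian-mixture density in Eq. (6), whose components are Eq. (5)."""
    total = 0.0
    for y_n in Y:
        density = sum(pi_k * multivariate_normal.pdf(y_n, mean=mu_k, cov=R_k)
                      for pi_k, mu_k, R_k in zip(pi, mu, R))
        total += np.log(density)
    return total
```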

A minimum description length (MDL) estimator works by attempting to find the model order which minimizes the number of bits that would be required to code both the data samples y_(n) and the parameter vector θ. The MDL criterion is expressed as follows:

MDL(K,θ) = −log p_(y)(y|K,θ) + (1/2) L log(NM)  (9)

Therefore, the objective is to minimize the MDL criterion

$\mathrm{MDL}(K,\theta) = -\sum_{n=1}^{N} \log\left( \sum_{k=1}^{K} p_{y_{n}|x_{n}}(y_{n} \mid k, \theta)\, \pi_{k} \right) + \dfrac{1}{2} L \log(NM) \qquad (10)$

In order to derive the EM algorithm update equations, it is required to compute the following equation (Expectation step):

$Q(\theta; \theta^{(i)}) = E\left[ \log p_{y,x}(y, X \mid \theta) \mid Y = y, \theta^{(i)} \right] - \dfrac{1}{2} L \log(NM) \qquad (11)$

where Y and X are the sets of random variables

$\{Y_{n}\}_{n=1}^{N},\ \{X_{n}\}_{n=1}^{N} \qquad (12)$

respectively, and y and x are realizations of these random objects.

Thus the following inequality holds:

MDL(K,θ) − MDL(K,θ^((i))) ≤ Q(θ^((i));θ^((i))) − Q(θ;θ^((i)))  (13)

This results in a useful optimization method, since any value of θ that increases the value of Q(θ;θ^((i))) is guaranteed to reduce the MDL criterion. The objective of the EM algorithm is hereby to iteratively optimize with respect to θ until a local minimum of the MDL function is reached.

The Q function is optimized in the following way:

Q(E,π;E^((i)),π^((i))) = E[log p_(y,x)(Y,X|E,π) | y, E^((i)), π^((i))] − KM log(NM)  (14)

In this case,

$Q \approx \sum_{k=1}^{K} \left\{ -\dfrac{1}{2} \operatorname{tr}\left( P_{k} \bar{R}_{k} \right) - \dfrac{(M-1)\bar{N}_{k}}{2} \log(2\pi) + \bar{N}_{k} \log \pi_{k} \right\} - K M \log(NM) \qquad (15)$

where

$\bar{N}_{k} = \sum_{n=1}^{N} p_{x_{n}|y_{n}}(k \mid y_{n}, E^{(i)}, \pi^{(i)}), \qquad \bar{R}_{k} = \sum_{n=1}^{N} y_{n} y_{n}^{t}\, p_{x_{n}|y_{n}}(k \mid y_{n}, E^{(i)}, \pi^{(i)}) \qquad (16)$

The EM update equations are then the following:

(E^((i+1)), π^((i+1))) = argmax_(E,π) Q(E, π; E^((i)), π^((i)))  (17)

The solution is given as follows:

e_(k)^((i+1)) = principal eigenvector of R̄_(k),  π_(k)^((i+1)) = N̄_(k)/N  (18)

Initially, the number K of subclasses is started sufficiently large, and is then decremented sequentially. For each value of K, the EM algorithm is applied until it converges to a local minimum of the MDL function. Eventually, the value of K and the corresponding parameters that resulted in the smallest value of the MDL criterion may be selected.

One method to effectively reduce K is to constrain the parameters of two classes to be equal, such that e_(l)=e_(m) for classes l and m. Moreover, letting E* and E*_(l,m) be the unconstrained and constrained solutions to Eq. (17), a distance function may be defined as follows:

d(l,m) = Q(E*,π*;E^((i)),π^((i))) − Q(E*_(l,m),π*;E^((i)),π^((i))) = σ_(max)(R_(l)) + σ_(max)(R_(m)) − σ_(max)(R_(l)+R_(m)) ≥ 0  (19)

where σ_(max)(R) denotes the principal eigenvalue of R. At each step, the two components that minimize the class distance are computed:

(l*,m*) = argmin_(l,m) d(l,m)  (20)

Finally, the two classes are merged and the number of subclasses K is decreased.
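A compact sketch of the overall loop described above is given below: an EM update of the Gaussian mixture, the MDL score of Eq. (10), and a simple merge of the two closest subclasses used to decrement K. The merge simplifies the constrained comparison of Eq. (19) to a covariance-averaging heuristic, and all function names, the parameter count L and the regularization constant are illustrative assumptions rather than the disclosed implementation.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_step(Y, pi, mu, R):
    """One Expectation-Maximization update of a Gaussian mixture.
    Y is (N, M); pi and mu are NumPy arrays, R a list of covariances."""
    N, K = len(Y), len(pi)
    resp = np.zeros((N, K))                       # p(x_n = k | y_n, theta)
    for k in range(K):
        resp[:, k] = pi[k] * multivariate_normal.pdf(Y, mean=mu[k], cov=R[k])
    resp /= resp.sum(axis=1, keepdims=True)
    Nk = resp.sum(axis=0)
    pi = Nk / N
    mu = (resp.T @ Y) / Nk[:, None]
    R = [np.cov(Y.T, aweights=resp[:, k], bias=True)
         + 1e-6 * np.eye(Y.shape[1]) for k in range(K)]
    return pi, mu, R

def mdl(Y, pi, mu, R):
    """MDL criterion of Eq. (10): negative log-likelihood plus (1/2)*L*log(NM)."""
    N, M = Y.shape
    K = len(pi)
    L = K * (1 + M + M * (M + 1) // 2) - 1        # assumed free-parameter count
    ll = sum(np.log(sum(pi[k] * multivariate_normal.pdf(y, mean=mu[k], cov=R[k])
                        for k in range(K))) for y in Y)
    return -ll + 0.5 * L * np.log(N * M)

def merge_closest(pi, mu, R):
    """Decrement K by merging the two subclasses whose means are closest
    (a simplification of the distance d(l, m) in Eq. (19))."""
    K = len(pi)
    l, m = min(((i, j) for i in range(K) for j in range(i + 1, K)),
               key=lambda ij: np.linalg.norm(mu[ij[0]] - mu[ij[1]]))
    w = pi[l] + pi[m]
    mu_new = (pi[l] * mu[l] + pi[m] * mu[m]) / w
    keep = [k for k in range(K) if k not in (l, m)]
    return (np.append(pi[keep], w),
            np.vstack([mu[keep], mu_new]),
            [R[k] for k in keep] + [(R[l] + R[m]) / 2.0])
```

In use, one would start from a deliberately large K, run em_step until the MDL score stops improving, record that score, call merge_closest, and repeat, finally keeping the K with the smallest recorded MDL value.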

Process of Behavior Control Apparatus

It should be noted that the learning stage and the behavior control stage are not strictly divided; both of them may be executed simultaneously, as in the example described below.

In other words, the behavior evaluation block 206 determines whether a feature of a newly provided image should be reflected in the knowledge acquired by previous learning in the behavior control stage. Furthermore, the behavior evaluation block 206 receives the motion estimated from the image. When a change of the external environment that was not covered by previous learning is captured by the image capturing block 202, the feature is sent to the behavior evaluation block 206, which outputs an attentional demand indicating that an attention class should be generated. In response to this, the learning block 208 generates an attention class. Thus the learning result is always updated and, therefore, the precision of the motion estimation is improved as well.

Now the control process of the behavior control apparatus of the invention installed on the RC helicopter will be described for a practical application. FIG. 6 is a flowchart of the process. This chart can be divided into two steps shown as two dotted-line rectangles in FIG. 6. One is the coarse step in the left-side column, where rough segregation of target/non-target is executed. The other is the fine step in the right-side column, where the target location is narrowed (refined) gradually.

At step 602, the probabilistic density distributions P(Ω_(l)) for all attention classes Ω_(l) of motion are assumed to be uniform. At step 604, the mobile unit moves randomly to collect data for learning. In this example, the data set collected for stabilizing the RC helicopter 100 was used to generate 500 training data points and 200 test points.

At step 606, the CCA reinforced EM algorithm is executed to calculate the parameters θ(μ, Σ) which define the probabilistic density distribution of Ω_(l). In the present example, 20 subclasses were used at first, but the number of subclasses converges by the CCA reinforced EM algorithm and is finally reduced to 3, as shown in FIG. 4.

At step 608, P(Q|Ω_(l)) is calculated with θ, where Q represents the behavior command. At step 610, the probabilistic relation between the feature vector I and the attention class Ω_(l) is calculated with a neural network. At step 612, the motion of the mobile unit is estimated by Bayes' rule. Steps 602 to 612 correspond to the learning stage.

At step 614, the Gaussian mixture model is calculated with the use of each probabilistic density function. The part of the image which is not included in the Gaussian mixture model is separated as non-target.

At step 616, the target object is recognized by template matching and the probabilistic density distribution Ω_(TL) of the target location is calculated. At step 618, the center of this distribution is defined as the target location.

At step 620, the difference D between the center of motion (COM) and the target location (TL) is calculated. At step 622, the map outputs a behavior command expanding the width of motion when the helicopter is far from the target location, and otherwise outputs a command reducing the width of the motion. FIG. 7 shows an example of the output behavior command. As seen, a map which takes different output values depending on D is stored in memory, and the corresponding value is looked up and transmitted to the servomotor.
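The map consulted at step 622 can be as simple as the lookup sketched below; the threshold and output widths are hypothetical placeholders for the values actually stored in memory.

```python
import numpy as np

def command_from_distance(D, threshold=0.5, wide=1.0, narrow=0.2):
    """Steps 620-622: derive a behavior command from the distance D between
    the center of motion (COM) and the target location (TL). A large |D|
    yields a command expanding the width of motion, a small |D| a command
    reducing it; the sign steers toward the target. All numeric values are
    illustrative placeholders for the stored map."""
    width = wide if abs(D) > threshold else narrow
    return np.sign(D) * width
```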

At step 624, it is determined whether D is smaller than the allowable error ε. If D is larger than ε, the accuracy of the target location is not sufficient and the process returns to step 606 to re-calculate θ. That is, how many Gaussian mixture components are needed to estimate the state of motion is a matter of the normalization problem. By increasing the applied number of Gaussian mixture components every time the process returns to step 606, the unit may estimate θ accurately and thus predict the target location accurately.

When D is smaller than ε at step 624, the helicopter is stable with sufficient accuracy with respect to the target location, and so the process is terminated. By setting ε small, the unit may control both the location of the helicopter and the duration during which the helicopter remains at that location. Steps 614 to 624 correspond to the behavior control stage.

Results

FIGS. 8 to 10 are graphs illustrating the control status of the RC helicopter. In these graphs, the horizontal axis represents the number of trials and the vertical axis represents the distance between the center of motion (COM) and the target location (TL) when controlling the helicopter to be stable. The two dotted straight lines in the graphs represent the threshold value ε used to determine stability of the control. The value ε is set to 0.1826 in the graphs.

FIG. 8 is a graph of the control immediately after the behavior control is initiated. In this case, the distance D does not become lower than ε and the vibration is still large, so the control is determined to be unstable. As the target location is narrowed, the vibration becomes smaller (FIG. 9). Finally, the control status becomes stable as shown in FIG. 10.

Some preferred embodiments have been described, but this invention is not limited to such embodiments. For example, the behavior control apparatus need not be installed on the mobile unit. In this case, only the CCD camera is installed on the mobile unit and the behavior control apparatus is installed at another place. Information is then transmitted through wireless communication between the camera and the apparatus.

INDUSTRIAL APPLICABILITY

According to one aspect of the invention, the behavior control apparatus roughly segregates the target area that includes a target object of behavior from sensory inputs, such as images, based on the estimation of motion. The apparatus then specifies a target object from the target area, acquires the location of the target object and outputs a behavior command which moves the mobile unit toward the location. Thus, detailed features of the target object need not be predetermined. In addition, because features irrelevant to the present behavior are eliminated, the computational load is reduced. Therefore, highly efficient and accurate control of the mobile unit may be implemented.

According to another aspect of the invention, the behavior control apparatus pre-learns the relationship between sensory inputs and behavior commands. The apparatus then updates the learning result when a new feature is acquired in the behavior control stage. The learning result is represented as a probabilistic density distribution. Thus, the motion of the mobile unit in the behavior control stage may be estimated with high accuracy.

CLAIMS

1. A behavior control apparatus wherein a target object to be used as a control reference is extracted from captured sensory inputs and behavior of a mobile unit is controlled using a location of the target object as the control reference, the apparatus comprising: sensory input capturing means for capturing sensory inputs including a target object; motion estimating means for estimating motion of the mobile unit; target segregation means for segregating from the sensory input the portion thereof in which the target object to be a target for behavior of the mobile unit is located; target object matching means for extracting the target object from said segregated portion; target location acquiring means for acquiring the location of the target object; and behavior decision means for deciding behavior command for controlling the mobile unit based on the location of the target object.

2. The behavior control apparatus claimed in claim 1, said motion estimating means comprising: behavior command output means for outputting said behavior command; behavior evaluation means for evaluating the result of the behavior of the mobile unit; learning means for learning the motion of the mobile unit using the relationship between said sensory inputs and said behavior result; and storage means for storing the learning result.

3. The behavior control apparatus claimed in claim 2, wherein said learning result is probabilistic density distribution.

4. The behavior control apparatus claimed in claim 1, wherein said target segregation means segregates said portion by comparing the sensory inputs and said estimated motion.

5. The behavior control apparatus claimed in claim 4, wherein said segregation is done by utilizing optical flow.

6. The behavior control apparatus claimed in claim 1, wherein said target location acquiring means defines the center of the target object as the location of said target object; said behavior decision means outputs the behavior command to move the mobile unit toward said location of the target object.

7. The behavior control apparatus claimed in claim 6, wherein said behavior decision means calculates the distance between the mobile unit and the location of said target object, said behavior decision means deciding the behavior command to decrease the calculated distance.

8. The behavior control apparatus claimed in claim 7, wherein if the calculated distance is greater than a predetermined value, said target segregation means repeats segregating said portion which includes a target object.

9. The behavior control apparatus claimed in claim 1, wherein said sensory input capturing means captures images of the external environment of the mobile unit as the sensory inputs.

10. The behavior control apparatus claimed in claim 1, wherein said target object matching means extracts the target object by pattern matching between the sensory inputs and predetermined templates.

11. The behavior control apparatus claimed in claim 1, wherein said sensory inputs capturing means is a gyroscope which captures motion of the mobile unit.

12. A method for controlling behavior of a mobile unit using behavior command wherein a target object to be used as a control reference is extracted from captured sensory inputs and behavior of the mobile unit is controlled using a location of the target object as the control reference, the method comprising the steps of: capturing sensory inputs including a target object; estimating motion of the mobile unit; segregating the portion of the sensory inputs which includes a target object to be target for behavior of the mobile unit; extracting the target object from said segregated portion; acquiring the location of said target object; and controlling the mobile unit based on the location of target object.

13. The method claimed in claim 12, wherein said estimating step further comprises: outputting said behavior command; evaluating the result of the behavior of the mobile unit; learning the motion of the mobile unit using the relationship between said sensory inputs and said behavior result; and storing the learning result.

14. The method claimed in claim 13, wherein said learning result is probabilistic density distribution.

15. The method claimed in claim 12, wherein said segregating is done by comparing the sensory inputs and said estimated motion.

16. The method claimed in claim 15, wherein said segregation is done by utilizing optical flow.

17. The method claimed in claim 12, wherein the center of the target object is defined as the location of said target object; said behavior command being determined so as to move the mobile unit toward said location of the target object.

18. The method claimed in claim 17, wherein the distance between the mobile unit and the location of the center of said target object is calculated, and then the behavior command is determined to decrease the calculated distance.

19. The method claimed in claim 18, wherein if the calculated distance is greater than a predetermined value, said segregating step is repeated.

20. The method claimed in claim 12, wherein said sensory inputs are images of the external environment of the mobile unit.

21. The method claimed in claim 12, wherein said extracting is done by pattern matching between the sensory inputs and predetermined templates.

22. The method claimed in claim 12, wherein motion of the mobile unit is captured using a gyroscope.

23. A computer program for implementing computer controlled behavior of a mobile unit by: capturing sensory inputs including a target object; estimating motion of the mobile unit; segregating the portion of the sensory inputs which includes a target object to be target for behavior of the mobile unit from sensory inputs; extracting the target object from said segregated portion; acquiring the location of said target object; and controlling the mobile unit based on the location of target object.

24. A computer-readable recording medium containing a program for controlling behavior of a mobile unit by implementing a computer for: capturing sensory inputs including a target object; estimating motion of the mobile unit; segregating the portion of the sensory inputs which includes a target object to be target for behavior of the mobile unit from sensory inputs; extracting the target object from said segregated portion; acquiring the location of said target object; and controlling the mobile unit based on the location of target object.